A RETRIEVABLE RECIPE FOR INVERSE "t"


Donald  P.  Gaver 
Naval  Postgraduate  School 
Monterey,  California 

and 

Karen  Kafadar 

Statistical  Engineering  Division 
National  Bureau  of  Standards 


1. INTRODUCTION

Various critical values (synonymous with percent points, or
evaluations of the inverse distribution function) of the classical
Student's t distribution are frequently useful in applied
statistics. Selected values are of course widely tabulated;
see Fisher and Yates (1963) and Pearson and Hartley (1976);
the latter are also reproduced, with extensions by E.T. Federighi,
in Abramowitz and Stegun (1968). In certain circumstances, however,
it is convenient to be able to compute "t" percent points
directly, accurately, and simply, without the need for extensive
tables except, perhaps, a normal (Gaussian) table; but see Section 5. A
simply derived, or retrievable, computational procedure for doing so is
presented in this paper. It can be carried out quickly on a hand-held
calculator and has been programmed, for instance, for the
TI-59, the TRS-80 and the HP-41C. The accuracy of
the numerical values obtained, especially at the usually required
levels (e.g., 95%) but also at much more extreme ones, coupled
with the ease of their computation, should provide a tempting
argument for their wide use.

Several similar approximations have appeared in various
journals over the last two decades. Among the most successful of
these is that derived by Peizer and Pratt (1968), hereafter
abbreviated PP:

$$\hat{t}_n(\alpha)_{PP} = \left\{ n \exp\left[ z^2(\alpha)\, \frac{n - \frac{5}{6}}{\left(n - \frac{2}{3} + \frac{1}{10n}\right)^2} \right] - n \right\}^{1/2}, \tag{1.1}$$

where $\alpha$ is the right single-tail probability, so $0 < \alpha < 0.5$.
Approximations based on asymptotic expansions appeared earlier
(Wallace 1958, 1959) and were successful for moderate degrees
of freedom and not-too-extreme tail areas. Other approaches
have involved rational functions in the degrees of freedom
(Gardiner and Bombay 1965, Kramer 1966) or the logistic distribution
(Mudholkar and Chaubey 1975). A formula due to Koehler (1983) is
based on a novel data-analytic approach to the t-tables, pioneered
by Hoaglin; let $\hat{t}_n(\alpha)_K$ represent Koehler's values. Further
accurate approximations are reviewed by Bailey (1980).

Often, suggested approximations are either simple but not
terribly accurate, or else extremely complicated, involving
many coefficients. The present approach offers both simplicity
and a high degree of accuracy, yielding two-digit accuracy or
better for moderate degrees of freedom, across a broad range of
tail areas. We call it a retrievable recipe because the simple
basic idea allows it to be rederived quickly when needed.

2.  DERIVATION 

Examination of an extensive table of Student's t, or some
mathematical analysis, shows that for $\alpha < 0.5$ there is a monotonically
increasing transformation that stretches a Normal
quantile $z(\alpha)$ into a Student's t quantile, $t_n(\alpha)$. Let

$$t_n(\alpha) = \psi_n^*(z(\alpha)),$$

with $\psi_n^*(\cdot)$ representing the transform.
We search for a simple approximation to $\psi_n^*(\cdot)$; call it $\psi_n(\cdot)$.
By definition,

$$\int_{-\infty}^{z(\alpha)} \frac{e^{-x^2/2}}{\sqrt{2\pi}}\, dx = \int_{-\infty}^{t_n(\alpha)} C(n) \left(1 + \frac{t^2}{n}\right)^{-(n+1)/2} dt = 1 - \alpha, \tag{2.1}$$

where $C(n)$ is the normalizing constant. Equivalently,

$$\int_{-\infty}^{z} \frac{e^{-x^2/2}}{\sqrt{2\pi}}\, dx = \int_{-\infty}^{\psi_n^*(z)} C(n) \left(1 + \frac{t^2}{n}\right)^{-(n+1)/2} dt. \tag{2.2}$$

Differentiation of both sides with respect to z now leads to

$$\frac{e^{-z^2/2}}{\sqrt{2\pi}} = C(n) \left(1 + \frac{[\psi_n^*(z)]^2}{n}\right)^{-(n+1)/2} \frac{d\psi_n^*(z)}{dz}. \tag{2.3}$$


Our approximation has its origin in the fact that $t_n(\alpha)$ approaches
$z(\alpha)$, and $d\psi_n^*(z)/dz \to 1$, as n becomes large for fixed $\alpha$. Consequently,
simply allow the approximation $\psi_n(z)$ to satisfy

$$\frac{e^{-z^2/2}}{\sqrt{2\pi}} = C(n) \left(1 + \frac{\psi_n^2(z)}{n}\right)^{-(n+1)/2} \tag{2.4}$$

for every n. Solving (2.4) for $\psi_n(z)$ leads to an expression of
the following general form:

$$t_n^2(\alpha) \approx \psi_n^2(z(\alpha)) = n \left\{ K(n)\, e^{H(n) z^2(\alpha)/2} - 1 \right\}. \tag{2.5}$$


But for $\alpha = 0.5$, $t_n(\alpha) = z(\alpha) = 0$, so $K(n) = 1$ for all n. In
order to determine $H(n)$, consider matching expectations of the random
variables. On the left-hand side of (2.5), $E(t_n^2) = \mathrm{Var}(t_n) = n/(n-2)$;
the right-hand side requires the evaluation

$$E[\exp\{H(n) Z^2/2\}] = \int_{-\infty}^{\infty} \exp\{H(n) z^2/2\}\, \frac{\exp\{-z^2/2\}}{\sqrt{2\pi}}\, dz = [1 - H(n)]^{-1/2},$$

where Z is a unit normal random variable. Notice also that this
evaluation may be recovered easily from the moment generating
function of the $\chi_1^2$ distribution. Thus for second moment
matching,

$$\frac{n}{n-2} = n \left[ (1 - H(n))^{-1/2} - 1 \right],$$

and so

$$H(n) = (2n - 3)/(n-1)^2. \tag{2.6}$$
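For readers who want the intermediate algebra, the step from the second-moment condition to (2.6) can be written out as below. This is only an expansion of the argument just given; it assumes n > 2, which is needed for Var(t_n) to exist.

```latex
\begin{align*}
\frac{n}{n-2} &= n\left[(1 - H(n))^{-1/2} - 1\right] \\
(1 - H(n))^{-1/2} &= 1 + \frac{1}{n-2} = \frac{n-1}{n-2} \\
1 - H(n) &= \frac{(n-2)^2}{(n-1)^2} \\
H(n) &= \frac{(n-1)^2 - (n-2)^2}{(n-1)^2} = \frac{2n-3}{(n-1)^2}
\end{align*}
```

Half of this quantity, H(n)/2 = (n - 3/2)/(n-1)^2, is the coefficient that multiplies the squared Gaussian percent point in the exponent of the approximation.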

Our suggested first approximation is, then,

$$\hat{t}_n(\alpha)_{GK} = \left[ n \exp\left\{ z^2(\alpha)\, (n - 3/2)/(n-1)^2 \right\} - n \right]^{1/2} \tag{2.7}$$

for $\alpha < 0.50$. Notice that this expression strongly resembles the
Peizer-Pratt approximation, but has a somewhat different exponent.
Numerical examples, displayed later, also suggest that it is of
acceptable accuracy, usually being somewhat superior to that of
Peizer and Pratt. A distinctive feature of the above approximation,
termed GK(I) for short, is its intuitively appealing and
easily recollected derivation: it is retrievable. Note that
this expression is convenient for simulating t-values, as in Ury
(1980). Iteration of the expression (i.e., replacing z by t
on the right-hand side of (2.7)) yields samples from even longer-tailed
distributions; such may be useful in robustness studies.
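The recipe (2.7) is short enough to sketch in a few lines. The following Python fragment is a minimal illustration, not the authors' calculator program: the function name `t_gk1` is hypothetical, and the Gaussian percent point is taken from the standard library's `statistics.NormalDist` rather than from a table or from formula (26.2.23) of AMS 55, so the last digit may differ slightly from tabulated values.

```python
from math import exp, sqrt
from statistics import NormalDist


def t_gk1(alpha: float, n: int) -> float:
    """Approximate upper-tail Student's t percent point via (2.7).

    alpha is the right single-tail probability (0 < alpha < 0.5);
    n is the degrees of freedom. The name t_gk1 is illustrative only.
    """
    if not 0.0 < alpha < 0.5:
        raise ValueError("alpha must lie strictly between 0 and 0.5")
    if n < 2:
        raise ValueError("n must be at least 2 so that (n - 1)**2 is nonzero")
    # Gaussian percent point z(alpha) with right-tail area alpha.
    z = NormalDist().inv_cdf(1.0 - alpha)
    # H(n)/2 = (n - 3/2) / (n - 1)^2, the coefficient of z^2 in the exponent.
    half_h = (n - 1.5) / (n - 1) ** 2
    return sqrt(n * exp(z * z * half_h) - n)


if __name__ == "__main__":
    for n in (4, 10, 20, 30, 60):
        print(n, round(t_gk1(0.05, n), 3))
```

For n = 10 and a tail area of 0.05 this gives roughly 1.812.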

3.  IMPROVING  THE  ACCURACY  OF  THE  APPROXIMATION 

Before numerically comparing the accuracy of $\hat{t}_n(\alpha)_{PP}$ with
GK(I), we consider a method for improving the accuracy, as follows.
Let us assume that the true value of Student's t can be written
as in (2.7) but with a slightly different tail area; i.e., with
$\alpha^*$ a function of $\alpha$:

$$t_n(\alpha) = \left\{ n \exp\left[ z(\alpha^*)^2\, (n - 3/2)/(n-1)^2 \right] - n \right\}^{1/2}. \tag{3.1}$$

Upon rewriting (3.1), we see that

$$\alpha^*(n) = 1 - \Phi\left( \left\{ \left[ \ln(1 + t_n^2(\alpha)/n) \right] \left[ (n-1)^2/(n - 3/2) \right] \right\}^{1/2} \right), \tag{3.2}$$

where $\Phi$ denotes the standard Gaussian cumulative distribution
function. Now Figure 1 shows that $\ln(\alpha^*(n) - \alpha)$ is roughly linear
in ln(n), for several values of $\alpha$. The least squares estimates
for the slope and intercept for a few values of $\alpha$ are shown in
Table 1. A typical value for the slope is taken to be -1.86;
the intercept behaves like $-3 + 0.62 \ln \alpha$. Thus

$$\alpha^*(n) \approx \alpha + \exp\{-3 + 0.62 \ln \alpha - 1.86 \ln n\}. \tag{3.3}$$

So our improved percent point should be

$$\hat{t}_n(\alpha)_{GK(II)} = \hat{t}_n(\alpha^*)_{GK(I)}. \tag{3.4}$$


Note that the adjustment to $\alpha$ in (3.3) decreases rapidly as n
increases. Of course, the above correction is empirical and
doubtless can be further improved. Unfortunately, it is not easily
retrieved in a manner analogous to the derivation of $\hat{t}_n(\alpha)_{GK(I)}$.
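A sketch of the adjusted recipe follows, under the assumption that the fitted line quoted in this section (slope -1.86 in ln n, intercept -3 + 0.62 ln α) is used directly to shift the tail area before applying (2.7). The function names are hypothetical, and `statistics.NormalDist` stands in for whatever Gaussian table or approximation is at hand.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def t_gk1(alpha: float, n: int) -> float:
    """First approximation (2.7) for right-tail area alpha, n degrees of freedom."""
    z = NormalDist().inv_cdf(1.0 - alpha)
    return sqrt(n * exp(z * z * (n - 1.5) / (n - 1) ** 2) - n)


def adjusted_alpha(alpha: float, n: int) -> float:
    """Shifted tail area, assuming ln(alpha* - alpha) = -3 + 0.62 ln(alpha) - 1.86 ln(n)."""
    return alpha + exp(-3.0 + 0.62 * log(alpha) - 1.86 * log(n))


def t_gk2(alpha: float, n: int) -> float:
    """Adjusted approximation: (2.7) evaluated at the shifted tail area."""
    return t_gk1(adjusted_alpha(alpha, n), n)


if __name__ == "__main__":
    # The shift is tiny for large n and shrinks quickly as n grows.
    for n in (4, 10, 30):
        print(n, adjusted_alpha(0.001, n) - 0.001, round(t_gk2(0.001, n), 3))
```

Because the shifted tail area is slightly larger than α, the adjusted percent point is slightly smaller than the unadjusted one; for n = 10 and α = 0.001 it comes out near 4.146.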


4.  COMPARING  THE  APPROXIMATIONS 

Figure 2 compares the accuracy of the three approximations (1.1),
(2.7), (3.4), and Koehler's formula as a function of
x = -10 log10(tail area), for n = 6, 10, 20, 30, by plotting the relative error
$[= (\hat{t}_n(\alpha) - t_n(\alpha))/t_n(\alpha)]$. Notice that in all the graphs, the simple
approximation given by GK(I) (2.7) is slightly better than that
suggested by Peizer and Pratt. Considerable improvement is attained
using the adjusted value of $\alpha$ given by GK(II) in (3.4).
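The relative-error comparison can be illustrated numerically. The sketch below is an assumption-laden example rather than a reproduction of Figure 2: the Peizer-Pratt constants 5/6, 2/3 and 1/(10n) are the commonly quoted ones and should be checked against (1.1), the "true" values are the n = 10 entries of Table 2, and the function names are hypothetical.

```python
from math import exp, sqrt
from statistics import NormalDist


def _z(alpha: float) -> float:
    """Gaussian percent point with right-tail area alpha."""
    return NormalDist().inv_cdf(1.0 - alpha)


def t_gk1(alpha: float, n: int) -> float:
    """GK(I), equation (2.7)."""
    return sqrt(n * exp(_z(alpha) ** 2 * (n - 1.5) / (n - 1) ** 2) - n)


def t_pp(alpha: float, n: int) -> float:
    """Peizer-Pratt form; the constants here are assumed, see the lead-in."""
    c = (n - 5.0 / 6.0) / (n - 2.0 / 3.0 + 1.0 / (10.0 * n)) ** 2
    return sqrt(n * exp(_z(alpha) ** 2 * c) - n)


def relative_error(approx: float, true: float) -> float:
    """(approximation - true) / true, as plotted in the comparison."""
    return (approx - true) / true


# True percent points for n = 10, copied from Table 2.
TRUE_T10 = {0.05: 1.812, 0.025: 2.228, 0.01: 2.764, 0.005: 3.169, 0.001: 4.144}

if __name__ == "__main__":
    for alpha, true in TRUE_T10.items():
        e_pp = relative_error(t_pp(alpha, 10), true)
        e_gk = relative_error(t_gk1(alpha, 10), true)
        print(f"alpha={alpha}: PP {e_pp:+.5f}  GK(I) {e_gk:+.5f}")
```

Since the tabulated true values carry only three decimals, differences in the fourth significant digit of the relative errors should not be over-interpreted.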

A few values of each approximation are tabulated in Table 2
and compared with the true percentage points. Notice that, while
GK(II) is initially worse than GK(I) for low degrees of freedom,
it results in an extra digit of accuracy for moderate n and extremely
small $\alpha$. In fact, GK(II) yields 2-3 decimals of accuracy for
$n \ge 10$ over the entire range of $\alpha$ considered, 0.05 to 0.000001.
Koehler's formula is better for small n (n = 4) and moderate $\alpha$
($\alpha > 0.025$), and is about the same as GK(I) and GK(II) when n is
very large (n = 60). However, the choice of approximation at n = 60
is possibly academic, as many users would be satisfied with Gaussian
percent points for such large degrees of freedom. In brief,
GK(II) obtains an extra digit of accuracy for extreme tail areas
and moderate degrees of freedom. Notice that the correction factor
is essentially 0 for large n, so there is no advantage of GK(II)
over GK(I) for n greater than, say, 30.

All approximations requiring $z(\alpha)$ used formula (26.2.23) from
AMS 55 (Abramowitz and Stegun 1968) in the table and figures of
comparisons. It may be noted that the approximation GK(I), (2.7),
may be inverted to determine approximate probability values (so-called
"p-values"). A table of the Gaussian distribution, or an
approximation to the Gaussian percent points, is required.
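The inversion mentioned above amounts to solving (2.7) for the Gaussian percent point and then reading off its upper-tail area. A minimal sketch, with a hypothetical function name and with `statistics.NormalDist` playing the role of the Gaussian table:

```python
from math import log, sqrt
from statistics import NormalDist


def p_value_gk1(t: float, n: int) -> float:
    """Approximate right-tail probability P(T_n > t) for t >= 0 by inverting (2.7).

    Solving (2.7) for z gives z^2 = ln(1 + t^2/n) * (n - 1)^2 / (n - 3/2);
    the p-value is then the Gaussian upper-tail area beyond z.
    """
    if t < 0.0:
        raise ValueError("this sketch handles the upper tail only (t >= 0)")
    if n < 2:
        raise ValueError("n must be at least 2")
    z = sqrt(log(1.0 + t * t / n) * (n - 1) ** 2 / (n - 1.5))
    # cdf(-z) equals the upper-tail area 1 - cdf(z) by symmetry.
    return NormalDist().cdf(-z)


if __name__ == "__main__":
    print(p_value_gk1(2.228, 10))
```

With t = 2.228 and n = 10, the tabulated 0.025 point, the result is close to 0.025.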

5. TOWARDS A SIMPLE STAND-ALONE APPROXIMATION

It  is  tempting  to  calculate  our  t-value  approximations,  which 
depend  upon  tabulated  normal  values,  with  the  aid  of  approximate 
normal  values  that  can  be  computed  easily  from  scratch.  The 
result  is  a  stand-alone  t-value  approximation,  accurate  to  nearly 
two  digits  over  a  surprisingly  large  range. 

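Here is a suggested way of proceeding. Tukey's $\lambda$-distribution
(see Tukey 1970, as referred to in McNeil 1977, p. 88) provides

$$z_T(\alpha) = F^{-1}(1 - \alpha;\, \lambda) = \left( \sqrt{\pi/2}\; 2^{\lambda} / 2\lambda \right) \left[ (1 - \alpha)^{\lambda} - \alpha^{\lambda} \right]; \tag{5.1}$$

with $\lambda = 0.14$ it yields inverse normal values to 3-digit accuracy
down to $\alpha = 0.01$. In order to extend fairly satisfactorily to
$\alpha = 10^{-6}$, proceed as follows: put $\alpha = 10^{-u}$ and write

$$\int_{z(u)}^{\infty} \frac{\exp\{-z^2/2\}}{\sqrt{2\pi}}\, dz = 10^{-u}, \tag{5.2}$$

$$-u \ln 10 = \ln \int_{z(u)}^{\infty} \frac{\exp\{-z^2/2\}}{\sqrt{2\pi}}\, dz. \tag{5.3}$$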


Now differentiate, and examine the result as u becomes large
(cf. Feller 1957, p. 193):

$$\ln 10 = \frac{e^{-z^2(u)/2}}{\int_{z(u)}^{\infty} e^{-z^2/2}\, dz}\, \frac{dz(u)}{du} \approx \frac{e^{-z^2(u)/2}}{e^{-z^2(u)/2}/z(u)}\, \frac{dz(u)}{du} = z(u)\, \frac{dz(u)}{du}. \tag{5.4}$$

Integration gives (for "large" u, here 2 < u < 6)

$$z(u) \approx \left[ z^2(u_0) + 2 (\ln 10)(u - u_0) \right]^{1/2}. \tag{5.5}$$


Take $u_0 = -\log_{10}(0.01) = 2$, $z(u_0) = z_T(0.01) = 2.58$, and replace
2 ln 10 by 4.32 to achieve slightly better results. Then utilize
these numbers to find, for $\alpha < 0.01$,

$$z_T(\alpha) = \left[ z_T^2(0.01) + 4.32\, (-\log_{10}(2\alpha) - 2) \right]^{1/2}. \tag{5.6}$$


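In summary, use the following prescription for the normal
values:

$$z(\alpha) \approx \begin{cases} z_T(\alpha) \text{ from (5.1)}, & \alpha \ge 0.01, \\ z_T(\alpha) \text{ from (5.6)}, & \alpha < 0.01, \end{cases} \tag{5.7}$$

with close to 2-digit accuracy throughout the stated range.
Refinement or improvement is possible, but at the apparent
price of a more elaborate representation.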

Table 2 includes t-values computed using the normal approximation
(5.7). These are labelled $\hat{t}_n(\alpha)_{GK(III)}$.

COMPARISON OF APPROXIMATIONS
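The stand-alone recipe can be sketched without any Gaussian table. In the fragment below the function names are hypothetical, and the switch between the two normal approximations at a tail area of 0.01 is an assumption taken from the ranges quoted in this section (the λ-distribution form with λ = 0.14 down to 0.01, the tail extension with constants 2.58 and 4.32 below it). The resulting normal value is then fed to (2.7).

```python
from math import exp, log10, pi, sqrt

LAMBDA = 0.14   # Tukey lambda value suggested in the text
Z_ANCHOR = 2.58  # anchor value z_T(0.01) quoted in the text
SLOPE = 4.32     # replaces 2 ln 10, as suggested in the text


def z_tukey(alpha: float, lam: float = LAMBDA) -> float:
    """Approximate normal percent point from the lambda-distribution form (5.1)."""
    scale = sqrt(pi / 2.0) * 2.0 ** lam / (2.0 * lam)
    return scale * ((1.0 - alpha) ** lam - alpha ** lam)


def z_tail(alpha: float) -> float:
    """Tail extension (5.6); log is base 10."""
    return sqrt(Z_ANCHOR ** 2 + SLOPE * (-log10(2.0 * alpha) - 2.0))


def z_standalone(alpha: float) -> float:
    """Piecewise normal value; the 0.01 switch point is an assumption."""
    if not 0.0 < alpha < 0.5:
        raise ValueError("alpha must lie strictly between 0 and 0.5")
    return z_tukey(alpha) if alpha >= 0.01 else z_tail(alpha)


def t_gk3(alpha: float, n: int) -> float:
    """Recipe (2.7) driven by the stand-alone normal approximation."""
    z = z_standalone(alpha)
    return sqrt(n * exp(z * z * (n - 1.5) / (n - 1) ** 2) - n)


if __name__ == "__main__":
    for alpha in (0.05, 0.01, 0.001):
        print(alpha, round(z_standalone(alpha), 3), round(t_gk3(alpha, 10), 3))
```

For n = 10 this gives roughly 1.824 at a tail area of 0.05 and 4.196 at 0.001, in line with the GK(III) rows of Table 2.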


Table 1

Linear fits of $\ln(\alpha^* - \alpha)$ vs $\ln(n)$

  $\alpha$      Slope     Intercept
  .05          -2.876      -3.447
  .02          -1.308      -8.487
  .01          -1.809      -6.495
  .005         -1.828      -6.473
  .001         -1.928      -6.800
  .0005        -1.943      -7.138
  .00005       -1.930      -8.683
  .00001       -1.927      -9.872
  .000005      -1.822     -10.664
  .000001      -1.771     -11.976


Table 2

Comparing approximations

Single tail area          .05      .025     .01      .005     .001     .0001
(-10 log10(tail area))    (13)     (16)     (20)     (23)     (30)     (40)

n = 4
  True                    2.132    2.776    3.747    4.604    7.171    1?.???
  K                       2.139    2.776*   3.708    4.509    6.853    12.???
  PP                      2.134*   2.787    3.780    4.667    7.379    13.???
  GK(I)                   2.118    2.763    3.741*   4.613*   7.2?6    13.?10
  GK(II)                  2.107    2.748    3.716    4.575    7.165*   13.091
  GK(III)                 2.134    2.790    3.773    4.628    7.402    13.???

n = 10
  True                    1.812    2.228    2.764    3.169    4.144    5.?94
  K                       1.823    2.242    2.778    3.182    4.147*   5.??4
  PP                      1.813    2.230    2.767    3.174    4.155    5.721
  GK(I)                   1.812*   2.229*   2.766    3.173    4.153    5.718
  GK(II)                  1.811    2.227*   2.764*   3.170*   4.147*   5.701*
  GK(III)                 1.824    2.245    2.781    3.179    4.196    5.7?2

n = 20
  True                    1.725    2.086    2.528    2.845    3.552    4.5?9
  K                       1.728    2.090    2.531    2.847    3.552*   4.553
  PP                      1.725*   2.087    2.529    2.847    3.554    4.543
  GK(I)                   1.725*   2.087    2.529    2.647    3.554    4.542
  GK(II)                  1.725*   2.086*   2.528*   2.840*   3.553    4.540*
  GK(III)                 1.735    2.100    2.541    2.851    3.583    4.580

n = 30
  True                    1.697    2.042    2.457    2.750    3.385    4.234
  K                       1.697*   2.042*   2.455    2.746    3.379    4.239
  PP                      1.697*   2.043    2.458*   2.751*   3.386*   4.236*
  GK(I)                   1.698    2.043    2.458*   2.751*   3.386*   4.236*
  GK(II)                  1.697*   2.043    2.458*   2.751*   3.386*   4.236*
  GK(III)                 1.707    2.056    2.470    2.755    3.412    4.267

n = 60
  True                    1.671    2.000    2.390    2.660    3.232    3.962
  K                       1.668    1.996    2.383    2.650    3.218    3.953
  PP                      1.671*   2.001*   2.391*   2.661*   3.232*   3.963*
  GK(I)                   1.668    1.996    2.383    2.650    3.218    3.953
  GK(II)                  1.668    1.996    2.383    2.650    3.218    3.953
  GK(III)                 1.680    2.013    2.401    2.665    3.255    3.989

* indicates closest approximation to true value
? marks a digit that is illegible in the source